The transfer-matrix technique is a convenient way for studying strip latticesin the Potts model since the compu- tational costs depend just on the periodicpart of the lattice and not on the whole. However, even when the cost isreduced, the transfer-matrix technique is still an NP-hard problem since thetime T(|V|, |E|) needed to compute the matrix grows ex- ponentially as afunction of the graph width. In this work, we present a paralleltransfer-matrix implementation that scales performance under multi-corearchitectures. The construction of the matrix is based on several repetitionsof the deletion- contraction technique, allowing parallelism suitable tomulti-core machines. Our experimental results show that the multi-coreimplementation achieves speedups of 3.7X with p = 4 processors and 5.7X with p= 8. The efficiency of the implementation lies between 60% and 95%, achievingthe best balance of speedup and efficiency at p = 4 processors for actualmulti-core architectures. The algorithm also takes advantage of the latticesymmetry, making the transfer matrix computation to run up to 2X faster thanits non-symmetric counterpart and use up to a quarter of the original space.
展开▼
机译:转移矩阵技术是在Potts模型中研究带状晶格的便捷方法,因为计算成本仅取决于晶格的周期部分,而不取决于整体。但是,即使降低了成本,转移矩阵技术仍然是NP-hard问题,因为计算矩阵所需的时间T(| V |,| E |)随着图形宽度的增加而呈指数增长。在这项工作中,我们提出了一个并行传输矩阵实现,该实现可在多核体系结构下扩展性能。矩阵的构建基于删除收缩技术的几次重复,从而使并行性适合于多核计算机。我们的实验结果表明,多核实现在p = 4的处理器下实现了3.7倍的加速,而在p = 8的情况下实现了5.7倍的加速。实现效率在60%到95%之间,在p =时实现了加速和效率的最佳平衡。实际多核体系结构的4个处理器。该算法还利用了晶格对称性,使传输矩阵计算的运行速度比非对称副本快2倍,并且占用了原始空间的四分之一。
展开▼